Record Counting in Historical Handwritten Documents with Convolutional Neural Networks

نویسندگان

  • Samuele Capobianco
  • Simone Marinai
چکیده

In this paper, we investigate the use of Convolutional Neural Networks for counting the number of records in historical handwritten documents. With this work we demonstrate that training the networks only with synthetic images allows us to perform a near perfect evaluation of the number of records printed on historical documents. The experiments have been performed on a benchmark dataset composed by marriage records and outperform previous results on this dataset.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling

The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images...

متن کامل

Document Binarization Combining with Graph Cuts and Deep Neural Networks

Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subjected to degradations such as parchment aging, smudges and bleed through from the other side. The text is sometimes printed, but more often handwritten. Mathematical modeling of the appearance of the text, as well as the b...

متن کامل

Historical Document Binarization Combining Semantic Labeling and Graph Cuts

Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subjected to degradations such as parchment aging, smudges and bleed through from the other side. The text is sometimes printed, but more often handwritten. Mathematical modeling of the appearance of the text, as well as the b...

متن کامل

Recognizing Challenging Handwritten Annotations with Fully Convolutional Networks

This paper introduces a very challenging dataset of historic German documents and evaluates Fully Convolutional Neural Network (FCNN) based methods to locate handwritten annotations of any kind in these documents. The handwritten annotations can appear in form of underlines and text by using various writing instruments, e.g., the use of pencils makes the data more challenging. We train and eval...

متن کامل

Identifying Cross-Depicted Historical Motifs

Cross-depiction is the problem of identifying the same object even when it is depicted in a variety of manners. This is a common problem in handwritten historical documents image analysis, for instance when the same letter or motif is depicted in several different ways. It is a simple task for humans yet conventional heuristic computer vision methods struggle to cope with it. In this paper we a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1610.07393  شماره 

صفحات  -

تاریخ انتشار 2016